This paper serves as a reference for, and an introduction to, the gdpR R package. The goal of the package is to provide tools for exploring the impact of different privacy regimes on a Bayesian analysis. A strength of this framework is its ability to target the exact posterior in settings where the likelihood is too complex to express analytically.
The ease and pervasiveness of modern data collection technologies have raised concerns about data privacy. Dwork and Roth (2013) introduced the differential privacy framework as a means to rigorously define privacy. The framework has led to the development of many "privatized" versions of existing statistical methods. Privatizing a method usually consists of introducing random noise in some way, with the noise drawn from a known distribution.
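As a concrete illustration of adding noise from a known distribution, the following is a minimal sketch of the standard Laplace mechanism applied to a sample mean. It is not taken from gdpR: the helper names `rlaplace` and `private_mean` are hypothetical, and the sketch assumes that the sample size is public and that each record is clamped to a known interval `[lower, upper]`.

```r
# Draw n samples from a Laplace(0, scale) distribution.
# The difference of two independent Exponential(rate = 1 / scale) draws
# follows a Laplace distribution with location 0 and the given scale.
rlaplace <- function(n, scale) {
  rexp(n, rate = 1 / scale) - rexp(n, rate = 1 / scale)
}

# Hypothetical helper (not part of gdpR): release a sample mean with
# epsilon-differential privacy via the Laplace mechanism.
private_mean <- function(x, lower, upper, epsilon) {
  stopifnot(lower < upper, epsilon > 0, length(x) > 0)
  # Clamp each record so that one record has bounded influence on the mean.
  x_clamped <- pmin(pmax(x, lower), upper)
  # Changing one record moves the clamped mean by at most this amount.
  sensitivity <- (upper - lower) / length(x)
  mean(x_clamped) + rlaplace(1, scale = sensitivity / epsilon)
}

set.seed(1)
x <- runif(200, min = 0, max = 10)
private_mean(x, lower = 0, upper = 10, epsilon = 1)
```

Smaller values of `epsilon` give a larger noise scale and therefore stronger privacy at the cost of accuracy.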
This section reviews This will show a verbatim inline R expression `r 1+1` in the output.
Packages for interactive graphics include plotly (Sievert 2020), which interfaces with JavaScript for web-based interactive graphics, and crosstalk (Cheng and Sievert 2021), which specializes in cross-linking elements across individual graphics. The recent R Journal paper on tsibbletalk (Wang and Cook 2021) provides a good example of including interactive graphics in an article for the journal. It has both a set of linked plots and an animated gif example, illustrating linking between time series plots and feature summaries.
ToOoOlTiPs is a package for customizing tooltips in interactive graphics. It features these possibilities.
The palmerpenguins data (Horst et al. 2020) features three penguin species, which are shown in a lovely illustration by Allison Horst in Figure 1.
Figure 1: Artwork by @allison_horst
Table 1 shows the first few rows of the penguins data:
| species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|
| Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male | 2007 |
| Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female | 2007 |
| Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female | 2007 |
| Adelie | Torgersen | NA | NA | NA | NA | NA | 2007 |
| Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female | 2007 |
| Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male | 2007 |
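A table like Table 1 could be produced along the following lines. This is a sketch rather than the code used for the article; it assumes the palmerpenguins and knitr packages are installed, and the variable name `first_rows` is only illustrative.

```r
library(palmerpenguins)  # provides the `penguins` data frame

# head() returns the first six rows by default
first_rows <- head(penguins)

# Render the rows as a Markdown-style table
knitr::kable(first_rows)
```

The fourth row contains `NA` in every measurement column, so any downstream summary needs to handle missing values explicitly.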
Figure 2 shows an interactive plot of the penguins data, made using the plotly package.
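library(palmerpenguins)  # penguins data
library(ggplot2)
library(plotly)          # ggplotly(); also makes the %>% pipe available

p <- penguins %>%
  ggplot(aes(x = bill_depth_mm, y = bill_length_mm,
             color = species)) +
  geom_point()
ggplotly(p)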
Figure 2: A basic interactive plot of the Palmer penguins data made with the plotly package. Three species of penguins are plotted with bill depth on the x-axis and bill length on the y-axis. Hovering over a point shows a tooltip with the exact bill depth and bill length for that point, along with the species name.
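The tooltip contents described in the caption can also be customized directly in plotly. The sketch below shows one common approach, which maps a `text` aesthetic in ggplot2 and passes `tooltip = "text"` to `ggplotly()`. It uses only plotly and ggplot2, not the ToOoOlTiPs package, and the names `penguins_complete` and `fig` are illustrative.

```r
library(palmerpenguins)
library(ggplot2)
library(plotly)

# Keep only rows with both bill measurements so every point has a full label
penguins_complete <- subset(
  penguins,
  !is.na(bill_depth_mm) & !is.na(bill_length_mm)
)

# The `text` aesthetic is not drawn by geom_point(); ggplotly() uses it
# as the tooltip content when tooltip = "text" is requested.
p <- ggplot(
  penguins_complete,
  aes(x = bill_depth_mm, y = bill_length_mm, color = species,
      text = paste0("Species: ", species,
                    "<br>Bill depth: ", bill_depth_mm, " mm",
                    "<br>Bill length: ", bill_length_mm, " mm"))
) +
  geom_point()

fig <- ggplotly(p, tooltip = "text")
fig
```

Building the label with `paste0()` keeps the tooltip text under the author's control, including units and line breaks.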
We have displayed various tooltips that are available in the package ToOoOlTiPs.
CRAN packages used: plotly, crosstalk, tsibbletalk, palmerpenguins, ggplot2
CRAN Task Views implied by cited packages: Phylogenetics, Spatial, TeachingStatistics, TimeSeries, WebTechnologies
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Awan, et al., "gdpR: An R Package for studying differentially private algorithms", The R Journal, 2022
BibTeX citation
@article{dppaper,
author = {Awan, Jordan A. and Eng, Kevin and Gong, Robin and Ju, Nianqiao Phyllis and Rao, Vinayak A.},
title = {gdpR: An R Package for studying differentially private algorithms},
journal = {The R Journal},
year = {2022},
issn = {2073-4859},
pages = {1}
}